ABSTRACTED-PUB-NO: RD 343006A
BASIC-ABSTRACT:

A hidden field containing the bucket number and a hidden field giving a
row's address or ID (RID) can be exploited for accessing rows to be moved.
An index on the bucket number (BNUM) hidden field is created and
maintained. The bucket number refers to the result of hashing a
partitioning key value in the rows of the table, determining which
partition each row of the table will be placed in. There can be many more
buckets than nodes, each node receiving the rows of several buckets. The
index maps BNUM to RID for each row in the partitioned table.

To access all the rows belonging to a given bucket or a given set of
buckets, the index is searched using the given bucket number(s) as search
key(s). The database management system's index manager is used for these
scans. The RIDs that are returned from the scan should be sorted, for the
most efficient retrieval. This can be done by using the database
management system's sort services in the normal manner, with the RID
column being the sort key. The sorted list of RIDs is then consulted, and
each row pointed to by each RID is retrieved, using the database
management system's record retrieval service.

USE/ADVANTAGE - To reorganise or add nodes to a partitioned database. Good
performance, simplicity and concurrency. An index lookup prevents need to
access rows that are not in buckets of concern, and sorting RIDs allows
efficient retrieval of rows (clusters I/Os so same page need not be read
more than once). Uses existing services, i.e. a typical database
management system already has components that perform index lookups,
sorts, and record retrieval. No need to access and therefore lock rows
that are not affected.
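
The lookup-sort-retrieve sequence in this abstract can be sketched in a few
lines of Python. This is only an illustration under stated assumptions: the
in-memory `table`, the (page, slot) form of a RID, and the helper names
`build_bnum_index` and `rows_for_buckets` are hypothetical stand-ins for the
index manager, sort service and record retrieval service, not the disclosed
implementation.

```python
from collections import defaultdict

# Hypothetical stand-ins: a RID is a (page, slot) pair, and `table` maps
# each RID to a (bucket_number, payload) row.
table = {
    (3, 1): (7, "row-a"),
    (1, 2): (7, "row-b"),
    (1, 0): (2, "row-c"),
    (2, 5): (9, "row-d"),
    (3, 0): (2, "row-e"),
}

def build_bnum_index(rows):
    """Build the BNUM -> list of RIDs index."""
    index = defaultdict(list)
    for rid, (bnum, _payload) in rows.items():
        index[bnum].append(rid)
    return index

def rows_for_buckets(rows, index, bnums):
    """Look up the given buckets, sort the RIDs, then fetch in RID order."""
    rids = [rid for b in bnums for rid in index.get(b, [])]
    rids.sort()  # page-major order, so reads of one page are adjacent
    return [(rid, rows[rid][1]) for rid in rids]

bnum_index = build_bnum_index(table)
moved = rows_for_buckets(table, bnum_index, [7, 2])
pages_touched = [rid[0] for rid, _ in moved]
```

Here the row in bucket 9 is never fetched, and `pages_touched` is
non-decreasing, which is the clustering effect the abstract attributes to
sorting the RIDs.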

Star join operation method in relational database management system,
involves selecting row and column from star map using hash row value and
accessing fact table accordingly
Inventor: KOSTAMAA O P; RAMESH B; BHASHYAM R
Abstract (Basic): EP 1148430 A2

NOVELTY - A cross-product is generated from the dimension tables
referenced by the star join, and the join columns are hashed to create a
hash-row value. One portion of the hash-row value is used to select a
star map row, and another portion is used to select a column of the
selected row. The fact table is accessed to join with the cross-product
when the selected column indicates that the record exists in the fact
table.

DETAILED DESCRIPTION - INDEPENDENT CLAIMS are also included for the
following:

(a) Computer implemented system for performing star join;

(b) Data structure; and

(c) Computer program for performing star join.

USE - For performing star join operation in relational database
management system (RDBMS) of computer system such as mainframe,
microcomputer or personal computer system.

ADVANTAGE - Since the hash-row value is used for addressing the star map,
the size of the map is kept constant, which improves the performance of
star joins.

DESCRIPTION OF DRAWING(S) - The figure shows a hardware and software
environment for performing star join.
pp; 12 DwgNo 1/5
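
A minimal sketch of the star map idea, assuming a fixed-size bitmap in
which each row is an integer of column bits. The CRC-32 hash, the 64 x 32
map shape, and the names `position`, `build_star_map` and `may_exist` are
assumptions made for illustration; the abstract does not specify them.

```python
import zlib
from itertools import product

ROWS, COLS = 64, 32  # assumed star map shape, fixed regardless of table size

def hash_row_value(key):
    """Stand-in hash of the join-column values (CRC-32 is an assumption)."""
    return zlib.crc32(repr(key).encode("utf-8"))

def position(key):
    """One portion of the hash picks the row, another picks the column."""
    h = hash_row_value(key)
    return (h >> 16) % ROWS, (h & 0xFFFF) % COLS

def build_star_map(fact_keys):
    star_map = [0] * ROWS  # each row is a COLS-bit integer
    for key in fact_keys:
        row, col = position(key)
        star_map[row] |= 1 << col
    return star_map

def may_exist(star_map, key):
    """True if the bit is set; hash collisions can give false positives."""
    row, col = position(key)
    return bool((star_map[row] >> col) & 1)

fact = {(1, "red"): 10.0, (2, "blue"): 4.5}
star_map = build_star_map(fact)
dims = product([1, 2, 3], ["red", "blue"])  # cross-product of dimensions
# Probe the fact table only when the star map says a row may exist.
joined = [(k, fact[k]) for k in dims if may_exist(star_map, k) and k in fact]
```

The final `k in fact` check stands in for the actual fact table access,
which is still needed because a set bit only indicates a possible match.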

Processor controlled index generation for relational database - by
analysing workload of all requests in system, assigning value of 
importance to each request, breaking request into expressions, contexts 
and columns to ease identification of candidate indexes 

Patent Assignee: ORACLE CORP (ORAC-N) 

Inventor: PANT S; SMITH G S 

Number of Countries: 001 Number of Patents: 001 
Patent Family: 

Patent No Kind Date Applicat No Kind Date Week 
Priority Applications (No Type Date): US 92886751 A 19920521
Patent Details:
Abstract (Basic): US 5404510 A

The indexes are generated by the processor, which establishes indexes
that provide efficient system operation. The processor examines the
logical schema and workload of the database. The indexes are generated
(48) by identifying importance values for the individual database
requests. They are designed by identifying new indexes in order of
request importance. Existing indexes are reused and modified based on
the match between existing indexes (54). Previously identified indexes
are searched for an index similar to each candidate index.

The importance value of the request is identified from expected 
frequency of processing and from individual importance of the request. 
Columns and associated operators for the individual table contexts 
are identified in each expression of each request. The candidate 
indexes are identified from the columns for individual contexts by 
order of request value of importance. 

USE/ADVANTAGE - For SQL. Reduces time to locate records by using
indexes; cheaper to scan indexes than tables. Uses hash indexing, applying
the system hash to the column data serving as index key to locate the
page of the index, giving more direct access to individual records.

Data file for organizing information in persistent computer storage
device, has equally sized data frames into which records comprising key
length/key data field and record length/data field are stored

NOVELTY - The data file has equally sized data frames into which
records are stored. A file header comprises a frame size field, a hash
type field and a modulo field which indicates the number of hash buckets
to be used by the hashing algorithm indicated by the hash type field. The
records have a key length field, a key data field, a record data field
and a record length field.

DETAILED DESCRIPTION - INDEPENDENT CLAIMS are also included for the 
following : 

(1) a method for storing records in persistent storage; and 

(2) a method for accessing a record in data file stored in 
persistent storage. 

USE - For organizing information in persistent computer storage 
device. As a multidimensional database. 

ADVANTAGE - Allows record of unlimited dimensions containing data 
of any type and size, in any combination, to be constructed, maintained 
and utilized in the persistent storage. The hashing algorithm is used 
to balance the distribution of records into the data frames, 
thereby minimizing the access time to any one record. 

DESCRIPTION OF DRAWING (S) - The figure shows a schematic view of a 
multi-dimensional data set. 
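
The record layout and modulo-based frame selection can be sketched as
follows. The field widths, the field order (key length, key, data length,
data), the CRC-32 hash and the names `pack_record`, `unpack_record`,
`frame_for` and `lookup` are all assumptions for illustration; the sketch
also ignores frame padding and overflow, which a real equally sized frame
would need.

```python
import struct
import zlib

def pack_record(key: bytes, data: bytes) -> bytes:
    """Assumed layout: 2-byte key length, key, 4-byte data length, data."""
    return (struct.pack(">H", len(key)) + key
            + struct.pack(">I", len(data)) + data)

def unpack_record(buf: bytes, offset: int = 0):
    """Return (key, data, offset of the next record)."""
    (klen,) = struct.unpack_from(">H", buf, offset)
    offset += 2
    key = buf[offset:offset + klen]
    offset += klen
    (dlen,) = struct.unpack_from(">I", buf, offset)
    offset += 4
    data = buf[offset:offset + dlen]
    return key, data, offset + dlen

def frame_for(key: bytes, modulo: int) -> int:
    """Hash the key and reduce it by the header's modulo field."""
    return zlib.crc32(key) % modulo

# Hypothetical file header with the three fields named in the abstract.
header = {"frame_size": 4096, "hash_type": "crc32", "modulo": 8}
frames = [b""] * header["modulo"]
for k, d in [(b"alpha", b"1"), (b"beta", b"22")]:
    frames[frame_for(k, header["modulo"])] += pack_record(k, d)

def lookup(key: bytes):
    """Scan only the frame the key hashes to."""
    buf = frames[frame_for(key, header["modulo"])]
    pos = 0
    while pos < len(buf):
        k, d, pos = unpack_record(buf, pos)
        if k == key:
            return d
    return None
```

Because the lengths travel with each record, keys and data of any size can
share a frame, and a lookup touches one frame only.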
Title Terms: DATA; FILE; ORGANISE; INFORMATION; PERSISTENT; COMPUTER;
STORAGE; DEVICE; EQUAL; SIZE; DATA; FRAME; RECORD; COMPRISE; KEY; LENGTH;
KEY; DATA; FIELD; RECORD; LENGTH; DATA; FIELD; STORAGE
Derwent Class: T01
International Patent Class (Main): G06F-017/30
File Segment: EPI



28/5/2 (Item 2 from file: 347)
DIALOG(R) File 347: JAPIO
(c) 2006 JPO & JAPIO. All rts. reserv.

06915216 **Image available**
DATABASE MANAGING METHOD

PUB. NO.:      2001-142752 [JP 2001142752 A]
PUBLISHED:     May 25, 2001 (20010525)
INVENTOR(s):   KAWAMURA NOBUO
               HOSHINO RYUICHI
               KASAO HIDEAKI
APPLICANT(s):  HITACHI LTD
               HITACHI SOFTWARE ENG CO LTD
APPL. NO.:     11-323657 [JP 99323657]
FILED:         November 15, 1999 (19991115)
INTL CLASS:    G06F-012/00; G06F-017/30

ABSTRACT



PROBLEM TO BE SOLVED: To solve the problem that data need to be rearranged 
in an added database storage area when the database storage area is added 
as the database capacity increases. 

SOLUTION: When (m) database storage areas are given as storage areas of a
database, one or more data items of the database are used as partitioning
keys and the database is divided into (n) (m <= n) buckets as logical
units by applying a hash function to the partitioning keys; and the
database is managed by using a hash map table, which determines which
database storage area manages each bucket according to the number of the
given database storage areas, and a segment hash map table for mapping
the distributed buckets to segments as storage units in the respective
database storage areas.



COPYRIGHT: (C) 2001, JPO 
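
A rough sketch of the bucket-to-storage-area hash map table, assuming 12
buckets and a simple rebalancing rule when an area is added. The names
`bucket_of`, `initial_map` and `add_area`, the CRC-32 hash and the
rebalancing policy are hypothetical; the segment hash map table is not
modelled here.

```python
import zlib

N_BUCKETS = 12  # n logical buckets, assumed fixed

def bucket_of(key):
    """Hash a partitioning key to a bucket (CRC-32 is an assumed stand-in)."""
    return zlib.crc32(str(key).encode("utf-8")) % N_BUCKETS

def initial_map(m):
    """Hash map table: bucket number -> storage area, spread round-robin."""
    return [b % m for b in range(N_BUCKETS)]

def add_area(hash_map):
    """Return a new table with one more area, reassigning only a fair share."""
    new_area = max(hash_map) + 1
    target = N_BUCKETS // (new_area + 1)  # buckets the new area should hold
    new_map = list(hash_map)
    for _ in range(target):
        # Take one bucket from whichever existing area currently holds most.
        counts = {a: new_map.count(a) for a in range(new_area)}
        donor = max(counts, key=counts.get)
        new_map[new_map.index(donor)] = new_area
    return new_map

old_map = initial_map(3)
new_map = add_area(old_map)
changed = [b for b in range(N_BUCKETS) if old_map[b] != new_map[b]]
```

Since keys hash to buckets rather than directly to areas, only the rows of
the buckets listed in `changed` would have to move when the fourth area is
added; rows in every other bucket stay where they are.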



28/5/5 (Item 5 from file: 347)
DIALOG(R) File 347: JAPIO
(c) 2006 JPO & JAPIO. All rts. reserv.

06665097 **Image available**
METHOD AND SYSTEM FOR MANAGING DATABASE

PUB. NO.:      2000-250921 [JP 2000250921 A]
PUBLISHED:     September 14, 2000 (20000914)
INVENTOR(s):   GOSHO HIROYUKI
               KIMURA KOJI
               KONDO YOICHI
               HARA KIYONOBU
               TAKAZAWA KAZUYUKI
APPLICANT(s):  HITACHI LTD
APPL. NO.:     11-049664 [JP 9949664]
FILED:         February 26, 1999 (19990226)
INTL CLASS:    G06F-017/30; G06F-012/00

ABSTRACT



PROBLEM TO BE SOLVED: To perform grouping operation which requires sorting
by a grouping column value at high speed by providing a means for
extracting records from a database and grouping them by using a hash
method and a means for sorting the grouped records.
SOLUTION: In hash grouping 102, ungrouped records are taken out of the
database one by one, a hash value is obtained from a hash function 103 by
using a grouping column value as a key, and grouped records are generated
in a hash grouping area 104 and related in sorting order. In grouped
record extraction 106, the grouped records are taken out of the hash
grouping area 104 in the sorting order, which is always held. In sort
grouping 107, ungrouped records stored in a work file 105 are sorted by
the grouping column value to generate grouped records.

COPYRIGHT: (C) 2000, JPO 
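
The combination of hash grouping with a sorted fallback can be sketched as
below, assuming a SUM aggregate and a hash grouping area limited to
`capacity` groups. The function name and the spill policy are hypothetical,
and unlike the abstract, which keeps the grouping area in sort order at all
times, this sketch sorts the area once at extraction.

```python
import heapq
from itertools import groupby

def hash_then_sort_group(records, capacity):
    """Sum values per key; `capacity` bounds the in-memory hash area."""
    area = {}        # hash grouping area: key -> running sum
    work_file = []   # ungrouped records that did not fit in the area
    for key, value in records:
        if key in area:
            area[key] += value
        elif len(area) < capacity:
            area[key] = value
        else:
            work_file.append((key, value))
    hashed = sorted(area.items())
    # Sort grouping: order the spilled records, then collapse equal keys.
    work_file.sort(key=lambda r: r[0])
    spilled = [(k, sum(v for _, v in grp))
               for k, grp in groupby(work_file, key=lambda r: r[0])]
    # A key is either in the area or in the work file, never both.
    return list(heapq.merge(hashed, spilled))

rows = [("b", 1), ("a", 2), ("c", 3), ("b", 4), ("c", 5), ("a", 6)]
result = hash_then_sort_group(rows, capacity=2)
```

With `capacity=2`, keys "a" and "b" are grouped by hashing while both "c"
records go through the work file, and the merged output is still in
grouping-column order.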

ABSTRACT

PURPOSE: To suppress the increase of the access frequency to an external
storage and also to suppress the delay of the searching speed for a 1st
file by setting the capacity of a unit information store area added with an
index at the element number value larger than the maximum quantity of
information that can be read into the external storage with a single
access.

CONSTITUTION: A hash address (h) is obtained from a record key K, and a
2nd file VI is checked for reading an index (p) out of the address (h).
Then the record stored in a 2nd file V is checked based on the index (p).
Thus both files V and VI are checked independently of each other and 
therefore at least two accesses are needed to an external storage. Under 
such conditions, the synonyms are stored contiguous to each other in the 
file V and the synonyms equal to the maximum quantity of information that 
can be read with a single access given to the external storage can be read 
en bloc. Thus it is possible to suppress the increase of the access 
frequency to the external storage and also to suppress the delay of the 
searching speed. 

ABSTRACT

improve the processing efficiency in a joint process by 
the number of tuples at the side having the smaller number of 
sorts with the hash packets at both sites according to the result 
hash sorting. 



CONSTITUTION: When a hash processing part 3 of a processor 1 is started,
the tuples of the desired relation are read successively out of a data
base. These tuples are sorted to one of hash packets 11-0-11-p
corresponding to the hash function value of the prescribed join field
value. Then the number of tuples is informed to a short informing part 4.
The part 4 transfers the number of tuples to a communication processing
part 2 and an array comparison part 5. The part 5 receives the number of
hash sorts from the site at the remote side and compares them with those
at the own site. Then the part 5 transmits the tuples sorted to the
packets even with the packet having a small number of hash sorts of the
own site. The result of said comparison is sent to a joint processing part
6 for execution of the joint processing with the tuples received from the
remote side for the packet whose joint processing should be applied at the
own site.

Records distribution determination method for computer system,
involves locating marker data records in database and analyzing position
of adjacent actual data records
Abstract (Basic): US 6560599 B1

NOVELTY - The actual data records (1) and marker data records (7),
which include pointers to logically preceding and succeeding actual data
records, are inserted in a threaded linear hash table (4) using a hash
function. A known key of a marker data record is hashed to locate
marker data records in the database and analyze the position of adjacent
actual data records using the hash function, for determining the
distribution of data records.

DETAILED DESCRIPTION - INDEPENDENT CLAIMS are also included for the 
following : 

(1) records distribution determination system; and 

(2) computer readable media storing records distribution 
determination program. 

USE - For determining distribution of marker records in hash
table used in various storage devices of computer system including
personal computer and servers in local area network (LAN), wide area
network (WAN) like Internet.

ADVANTAGE - The distribution of records in hash table is
determined efficiently by locating marker data records in database and
analyzing position of adjacent actual data records, thereby
efficiently assessing distribution of records, optimizing quantity
of data stored in memory and adjusting or tuning hash function to
increase density of records around a given mark.

DESCRIPTION OF DRAWING(S) - The figure shows the hash table.

actual data records (1)
hash table (4)
marker data records (7)

NOVELTY - The actual and marker data records having a logical
ordering specified by the keys, are inserted in a hash table using a 
hash function. The keys of marker data records are distributed at 
known positions throughout the range of keys of actual data records. 
One of the keys of the marker data records is hashed to locate the 
associated record in the hash table, if no record exists in the hash 
table for the input key. 

DETAILED DESCRIPTION - An INDEPENDENT CLAIM is included for 
computer readable recorded medium storing data record locating program. 

USE - For locating data record in computer system using hash 
table. 

ADVANTAGE - Efficiently performs sequential linear access and other 
operations on logically ordered data stored in non-logical order in 
hash table. 

DESCRIPTION OF DRAWING(S) - The figure shows the flowchart
explaining data record locating process.

Transforming records to be hashed stored in storage form processing
system - uses group-by operation execution device for reading list of
hashed records output to storage device by output device and sorting
hashed records in list according to key value
Abstract (Basic): EP 877324 A

The system includes a record storing device (106) for temporarily 
storing the records. A pointer storing device (107) is used for storing 
pointers to the records in the record storing device (106) at 
positions. Each of the latter corresponds to a hash function value 
calculated using the key value of the pointed record. An output device 
(108) outputs the records pointed to by the pointers stored in the 
pointer storing device (107) to the storage device, given the hash 
function values for storage positions of the pointers. 

A group-by operation execution device (109) is used for reading a 
list of hashed records output to the storage device by the output 
device (108), sorting the hashed records in the list according to 
the key value, and performing the group-by operation on the list of the 
sorted records . 

USE - For processing large amount of data stored in database for
obtaining rule of relationship among data stored in database.

ADVANTAGE - Provides high speed using result of hash process. 

Accessing rows based on hidden field values - creating an index on bucket
number hidden field and mapping to each row address in partitioned
table
Abstract (Basic): RD 343006 A

A hidden field containing the bucket number and a hidden field
giving a row's address or ID (RID) can be exploited for accessing rows
to be moved. An index on the bucket number (BNUM) hidden field is
created and maintained. The bucket number refers to the result of
hashing a partitioning key value in the rows of the table,
determining which partition each row of the table will be placed in.
There can be many more buckets than nodes, each node receiving the rows
of several buckets. The index maps BNUM to RID for each row in the
partitioned table.

To access all the rows belonging to a given bucket or a given set
of buckets, the index is searched using the given bucket number(s) as
search key(s). The database management system's index manager is used
for these scans. The RIDs that are returned from the scan should be
sorted, for the most efficient retrieval. This can be done by using the
database management system's sort services in the normal manner, with
the RID column being the sort key. The sorted list of RIDs is then
consulted, and each row pointed to by each RID is retrieved, using the
database management system's record retrieval service.

USE/ADVANTAGE - To reorganise or add nodes to a partitioned
database. Good performance, simplicity and concurrency. An index
lookup prevents need to access rows that are not in buckets of concern,
and sorting RIDs allows efficient retrieval of rows (clusters I/Os so
same page need not be read more than once). Uses existing services,
i.e. a typical database management system already has components that
perform index lookups, sorts, and record retrieval. No need to
access and therefore lock rows that are not affected.

Abstract (Basic): EP 350208 A

A real-time database comprises data storage routines, data
retrieval routines, data updating routines, and an index hashing
mechanism, for storing, searching and retrieving tuples in data tables,
and for storing and retrieving unformatted data in input areas. The
data retrieval routines include a routine to directly access data using
tuple identifiers, and a routine to directly access unformatted data
from input areas. The data retrieval routines for accessing tuples in
data tables include an option to read-through-lock to access tuples in
locked data tables.

The data updating routines include an option to omit index updating
when updating data and an option to update data in a locked data table.
The data storage routines and the data retrieval routines are
independent and use hash index tables to relate an index key to an
entry in the data table, so that multiple indexes can be defined for a
data table. The data table structure includes a column defined for
storing tuple identifier strings.

ADVANTAGE - Provides high speed data access required for on-line 
applications. 

Abstract (Basic): EP 72413 A

A controller monitors the controls of a cache containing copies of 
a variable subset of the records in backing storage together with a 
directory for locating the same, and for observing a hashing protocol 
in a hash mechanism. The backing storage is modular and has 
functional discontinuity between modules, each module having a multiple 
record capacity, with the expectancy of certain known preferred modules 
being more frequently accessed than other modules. 

The directory is maintained in the form of independent strings of 
entries incorporating link elements and a table is maintained mapping 
the classes defined under the hashing protocol onto the directory 
strings. The hashing mechanism accesses the table to access the 
entries of a string sequentially in link order to a match or to the end 
of the string. The hashing mechanism maps records onto table entries 
such that, given a table entry order , sequential records within a 
module map onto sequential table entries, and no record in a preferred 
module maps onto the same table entry as does any record on any other 
preferred module. 

The recent advances in parallel and distributed processing and its
applications to database operations such as join have initiated extensive
research in this field. Investigations on hash based join algorithms
compared to methods such as join-index join and merge-sort have given 
encouraging results. However, they involve a costly data partitioning phase 
prior to the join. This costly partitioning phase can be avoided if file 
structures that keep data already partitioned in the secondary storage 
are used. Interpolation Based Grid File (IBGF) is such a file structure. In 
this thesis new join algorithms for parallel computers for relations based 
on IBGF are investigated. Different algorithms are used for uniform 
relations and nonuniform relations. The efficiencies of these algorithms 
based on relations and architecture, have been studied using simulation. 
However the comparison of different techniques is not within the scope of 
this work. 

Abstract: Shared nothing multiprocessor architecture is known to be more
scalable to support very large databases. Compared to other join 
strategies, a hash-based join algorithm is particularly efficient and 
easily parallelized for this computation model. However, this hardware 
structure is very sensitive to the skew in tuple distribution. Unless the 
parallel hash join algorithm includes some dynamic load balancing 
mechanism, the skew effect can severely deteriorate the system performance. 
In this paper, we investigate this issue. In particular, three parallel 
hash join algorithms are presented. We implement a simulator to study the 
effectiveness of these schemes. The simulation model is validated by 
comparing the simulation results to those produced by the actual 
implementation of the algorithms running on a multiprocessor system. Our 
performance study indicates that a naive approach is not able to provide 
tangible savings. However, the carefully designed strategies can offer 
substantial improvement over conventional techniques for a wide range of 
skew conditions. (35 Refs) 
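
To illustrate why tuple skew matters for a parallel hash join, the sketch
below compares a static bucket-to-node rule with a generic greedy
largest-first assignment. This is not one of the three algorithms of the
paper, which the abstract does not describe; the bucket sizes and the names
`naive_assignment`, `greedy_assignment` and `node_loads` are hypothetical.

```python
import heapq

def naive_assignment(bucket_sizes, n_nodes):
    """Static rule: bucket b always goes to node b mod n_nodes."""
    return {b: b % n_nodes for b in bucket_sizes}

def greedy_assignment(bucket_sizes, n_nodes):
    """Give each bucket, largest first, to the currently least-loaded node."""
    heap = [(0, node) for node in range(n_nodes)]
    heapq.heapify(heap)
    owner = {}
    ordered = sorted(bucket_sizes.items(), key=lambda kv: (-kv[1], kv[0]))
    for bucket, size in ordered:
        load, node = heapq.heappop(heap)
        owner[bucket] = node
        heapq.heappush(heap, (load + size, node))
    return owner

def node_loads(bucket_sizes, owner, n_nodes):
    """Total number of tuples each node must join."""
    loads = [0] * n_nodes
    for bucket, size in bucket_sizes.items():
        loads[owner[bucket]] += size
    return loads

# Hypothetical skewed distribution: bucket 0 holds half of all tuples.
sizes = {0: 50, 1: 10, 2: 10, 3: 10, 4: 10, 5: 10}
naive_loads = node_loads(sizes, naive_assignment(sizes, 2), 2)
balanced_loads = node_loads(sizes, greedy_assignment(sizes, 2), 2)
```

With these sizes the static rule leaves one node with 70 tuples and the
other with 30, while the greedy assignment gives each node 50, so the join
phase is no longer bound by a single overloaded node.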

Abstract: Trie hashing (TH), a primary key access method for
storing and accessing records of dynamic files, is discussed. The key
address is computed through a trie. A key search usually requires only one
disk access when the trie is in core and two disk accesses for very large
files when the trie must be on disk. A refinement to trie hashing, trie
hashing with controlled load (THCL), is presented. It is designed to
control the load factor of a TH file as tightly as that of a B-tree file, 
allows high load factor of up to 100% for ordered insertions, and increases 
the load factor for random insertions from 70% to over 85%. It is shown 
that these properties make trie hashing preferable to a B-tree. (29 Refs) 

Abstract: An aid is described which can assist the data base
administrator in selecting optimally the storage structures for a data 
base. This data base has a network structure according to the CODASYL 
specifications. It is assumed that the administrator can define record and 
set types. It is also assumed that the data base system offers the 
following options, which may be selected by the data base administrator: an 
index can be defined on each attribute of a record type; non-singular sets 
can be implemented by means of pointer-arrays which are stored either 
adjacent to the owner record or are connected to this record by means of a 
pointer; and hashing on the primary key can be used to store and 
retrieve records and if hashing is not used then the clustering of those 
records on attribute value is possible. In order to select the storage 
structures a data base specification and load definition language is 
needed. The chosen object function is the minimization of the number of 
page accesses. (9 Refs) 

Abstract: Studies file designs for answering partial-match queries for
dynamic files. A partial-match query is a specification of the value of
zero or more fields in a record. An answer to a query consists of a listing
of all records in the file satisfying the values specified. The main
contribution is a general method whereby certain primary key hashing
schemes can be extended to partial-match retrieval schemes. These
partial-match retrieval designs can handle arbitrarily dynamic files and
can be optimized with respect to the number of page faults required to
answer a query. The authors illustrate the method by considering in detail
the extension of two recent dynamic primary key hashing schemes. (11
Refs)
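
The general idea of answering partial-match queries with hashing can be
sketched with a static multi-attribute hash, in which each field
contributes a few bits of the bucket address. This is a textbook-style
illustration, not the dynamic schemes of the paper; the bit allocation
`BITS`, the CRC-32 hash and all function names are assumptions.

```python
import zlib
from itertools import product

BITS = (2, 1, 2)  # assumed number of address bits contributed by each field

def field_bits(value, nbits):
    """Hash one field value down to nbits bits."""
    return zlib.crc32(repr(value).encode("utf-8")) % (1 << nbits)

def compose(parts):
    """Concatenate per-field bit groups into one bucket address."""
    addr = 0
    for part, nbits in zip(parts, BITS):
        addr = (addr << nbits) | part
    return addr

def address(record):
    return compose(field_bits(v, n) for v, n in zip(record, BITS))

def candidate_addresses(query):
    """A query uses None for an unspecified field, which matches all bits."""
    choices = [
        range(1 << nbits) if value is None else [field_bits(value, nbits)]
        for value, nbits in zip(query, BITS)
    ]
    return [compose(parts) for parts in product(*choices)]

def partial_match(buckets, query):
    hits = []
    for addr in candidate_addresses(query):
        for rec in buckets.get(addr, []):
            if all(q is None or q == v for q, v in zip(query, rec)):
                hits.append(rec)
    return hits

records = [("ann", "x", 1), ("bob", "y", 2), ("ann", "y", 3)]
buckets = {}
for rec in records:
    buckets.setdefault(address(rec), []).append(rec)
found = partial_match(buckets, ("ann", None, None))
```

Specifying only the first field narrows the search from all 32 bucket
addresses to 8, and each further specified field narrows it again.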

Abstract: Database operating efficiency in relational databases can be
improved significantly by using proper storage schema. Two such methods to
improve performance are attribute partitioning and record clustering. In
the past, the two design problems were treated separately: the optimal
attribute partitioning was obtained without considering the order in
which tuples are laid out; this does not produce a true optimal solution.
The paper provides experimental evidence that the best attribute
partitioning scheme is dependent on the tuple ordering and vice versa.
The combined optimal solution of attribute partitioning and tuple
ordering obtained in the experiments showed an improvement in database
operating efficiency ranging from 12% to 35%, compared to the independent
optimal solutions of attribute partitioning and tuple ordering
considered separately. (6 Refs)